Abstract

This paper proposes an estimator of the covariance matrix of currencies
using unsynchronized and noisy high frequency observations. The
estimator allows us to estimate the covariance matrix over a shorter
time interval with more accuracy. The estimator is not f-consistent
when there are so-called observation noises. Increasing the observation
frequency without limit does not always increase the accuracy of the
estimation. The optimal observation frequency depends on the ratio of
the total volatility to the noise level. Daily covariance matrices of
three exchange rates are calculated to demonstrate the methodology. The
empirical results show that the correlations of the three currencies are
strong but vary over time.


Key  Words:  f-consistency;  observation  noise;  volatility. 


1     Introduction 

The estimation of the covariance matrix of financial prices is necessary in
portfolio optimization and risk management. Besides the sample covariance,
many other estimators have been proposed (Stein 1975, Dey and Srinivasan
1985). However, estimating the covariance matrix from daily data can have
serious problems. Jobson and Korkie (1980) indicated that, in some cases, it
is better to use the identity matrix instead of the sample covariance matrix
in portfolio selection. The problem is that the number of observations is not
enough to estimate all entries of a big covariance matrix. To get around the
problem, one may want to collect more data over a longer time interval.
However, changing market conditions may prevent us from doing so. Another
approach is to impose constraints on the covariance matrix to reduce the
number of free parameters (Frost and Savarino, 1986). The constraints may be
subjective and may not reflect the reality of the market. This paper explores
another possibility: using high frequency data. Because of fast-growing
computer power, data is now available at ultra-high frequency, such as
tick-by-tick. Exchange rates, for example, can easily have over one million
observations in one year.

There are several difficulties in using high frequency data. The first one is
the issue of unsynchronized time. The price or quote of each currency or stock
comes at different times. The second difficulty is so-called observation noise
(Zhou 1995a). The high frequency time series can be viewed as observations
from a continuous process with observation noises:

\[ S(t_i) = P(t_i) + \epsilon_{t_i}, \qquad t_i \in [a, b], \tag{1} \]

where $P(t)$ is a diffusion process

\[ dP(t) = \mu(t)\,dt + \sigma(t)\,dW_t. \tag{2} \]

$P(t)$ is referred to as the signal process. The noise $\epsilon_{t_i}$ only
occurs at the time of observation. $\epsilon_{t_i}$ behaves irregularly, and
no distributional assumptions are made about the noises. The representation
captures many characteristics, such as the strong negative autocorrelations
in high frequency observations. The presence of observation noises can cause
great difficulties in constructing f-consistent (Zhou 1995b) estimators. For
an estimator without f-consistency, increasing the observation frequency
without limit can do more harm than good.
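The link between model (1) and the negative autocorrelation of tick returns can be illustrated with a small simulation. The sketch below is only an illustration under assumptions that are not in the paper: the signal is a driftless Gaussian random walk, the noises are independent Gaussian draws, and the function names and parameter values are hypothetical. Under these assumptions the first-order autocorrelation of the observed returns is close to $-\eta^2/(\sigma^2 + 2\eta^2)$, where $\sigma^2$ is the per-tick signal variance and $\eta^2$ the noise variance.

```python
import random


def simulate_noisy_ticks(n_ticks, sigma_tick, eta, seed=0):
    """Return observations S_i = P_i + eps_i for a Gaussian random-walk signal P.

    Assumption (not from the paper): Gaussian signal increments and Gaussian,
    independent observation noises.
    """
    rng = random.Random(seed)
    signal = 0.0
    observed = []
    for _ in range(n_ticks):
        signal += rng.gauss(0.0, sigma_tick)            # signal increment
        observed.append(signal + rng.gauss(0.0, eta))   # noise enters only at observation
    return observed


def lag1_autocorrelation(values):
    """Sample first-order autocorrelation of a sequence."""
    n = len(values)
    mean = sum(values) / n
    num = sum((values[i] - mean) * (values[i - 1] - mean) for i in range(1, n))
    den = sum((v - mean) ** 2 for v in values)
    return num / den


prices = simulate_noisy_ticks(20000, sigma_tick=1.0, eta=2.0)
returns = [b - a for a, b in zip(prices, prices[1:])]
rho = lag1_autocorrelation(returns)
print(rho)  # theory under these assumptions: -4 / (1 + 8), about -0.44
```

With a noise variance four times the per-tick signal variance, the simulated autocorrelation is of the same order as the values near -0.45 reported for the most active exchange rates in Table 1.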

In the next section, I will concentrate on constructing the estimators for
the covariance matrix of unsynchronized financial time series. Without loss of
generality, I will only discuss estimating the covariance of two financial time
series. The variance estimators can be found in Zhou (1995a, 1995b). In the
last section, I will give the estimates of the daily covariance and daily
correlation matrices of three exchange rates.


2     Estimating The Covariance Using Unsynchronized Data

Suppose that two time series $\{S_X(t_i)\}$ and $\{S_Y(s_j)\}$ are
observations from two processes of the form (1) with observation noise. To
further simplify the problem, I assume that the diffusion process $P(t)$ is a
Brownian motion with a time deformation:

\[ S_X(t_i) = \mu_X(t_i) + W_X(\tau_X(t_i)) + \epsilon_X(t_i), \qquad a \le t_i \le b, \]
\[ S_Y(s_j) = \mu_Y(s_j) + W_Y(\tau_Y(s_j)) + \epsilon_Y(s_j), \qquad a \le s_j \le b. \]

The noises $\epsilon_X(t_i)$ and $\epsilon_Y(s_j)$ are independent of each
other and of the processes $W_X(\tau)$ and $W_Y(\tau)$. The parameter of
interest is the covariance of the two Brownian motions $W_X(t)$ and $W_Y(t)$
over the time interval $[a, b]$:

\[ c(a, b) = \mathrm{Cov}\bigl(W_X(b) - W_X(a),\; W_Y(b) - W_Y(a)\bigr). \tag{3} \]

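The two processes $W_X(\tau)$ and $W_Y(\tau)$ are assumed to have no leading
effect, i.e., for $\tau_1 < \tau_2 < \tau_3 < \tau_4$,

\[
\begin{aligned}
\mathrm{Cov}\bigl(W_X(\tau_1) - W_X(\tau_2),\; W_Y(\tau_3) - W_Y(\tau_4)\bigr) &= 0, \quad\text{and} \\
\mathrm{Cov}\bigl(W_Y(\tau_1) - W_Y(\tau_2),\; W_X(\tau_3) - W_X(\tau_4)\bigr) &= 0.
\end{aligned}
\tag{4}
\]

Under assumption (4),

\[ c(s, t) + c(t, u) = c(s, u), \qquad s < t < u. \]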
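The additivity property follows in one step from definition (3). The derivation below is a sketch of that step; it assumes that (4) can be applied to the adjacent intervals $[s, t]$ and $[t, u]$, i.e., with $\tau_2 = \tau_3 = t$ taken as a limiting case.

```latex
\begin{aligned}
c(s,u) &= \mathrm{Cov}\bigl(W_X(u)-W_X(s),\; W_Y(u)-W_Y(s)\bigr) \\
       &= \mathrm{Cov}\bigl(W_X(t)-W_X(s),\; W_Y(t)-W_Y(s)\bigr)
        + \mathrm{Cov}\bigl(W_X(u)-W_X(t),\; W_Y(u)-W_Y(t)\bigr) \\
       &\quad + \underbrace{\mathrm{Cov}\bigl(W_X(t)-W_X(s),\; W_Y(u)-W_Y(t)\bigr)}_{=\,0 \text{ by (4)}}
        + \underbrace{\mathrm{Cov}\bigl(W_X(u)-W_X(t),\; W_Y(t)-W_Y(s)\bigr)}_{=\,0 \text{ by (4)}} \\
       &= c(s,t) + c(t,u).
\end{aligned}
```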


Without making any assumption about the regularity of the time sequences
$\{t_i\}$ and $\{s_j\}$ and the regularity of the covariance of each pair of
$S_X(t_i) - S_X(t_{i-1})$ and $S_Y(s_j) - S_Y(s_{j-1})$, it is very difficult
to construct a maximum likelihood estimator of the covariance over $[a, b]$
using observations within the interval. Instead, I propose the following
unbiased estimator:

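\[ \hat{c}(a, b) = \sum_{a < s_j \le b} \bigl(S_X(t_{i^+(j)}) - S_X(t_{i^-(j-1)})\bigr)\bigl(S_Y(s_j) - S_Y(s_{j-1})\bigr), \tag{5} \]

where

\[ i^+(j) = \min\{i : t_i \ge s_j\} \quad\text{and}\quad i^-(j) = \max\{i : t_i \le s_j\}. \]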

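Notice that the subscripts $i$ and $j$ are interchangeable. That is, estimator
(5) can also be written as

\[ \hat{c}(a, b) = \sum_{a < t_i \le b} \bigl(S_X(t_i) - S_X(t_{i-1})\bigr)\bigl(S_Y(s_{j^+(i)}) - S_Y(s_{j^-(i-1)})\bigr), \tag{6} \]

with $j^+(i)$ and $j^-(i)$ defined analogously to $i^+(j)$ and $i^-(j)$.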

When the observation times are synchronized, the estimator is simply the
sample covariance. To save computation time, the series with fewer data
points should play the role of $\{S_Y(s_j)\}$.
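Estimator (5) can be sketched in a few lines of code. The function below is a minimal illustration, not the author's implementation: the name `unsync_covariance`, the list-based inputs, and the choice to skip Y ticks that are not covered by X ticks on both sides are all assumptions made here. The index $i^+(j)$ is found with `bisect_left` and $i^-(j-1)$ with `bisect_right`, which matches the definitions with non-strict inequalities.

```python
from bisect import bisect_left, bisect_right


def unsync_covariance(t, sx, s, sy):
    """Sketch of estimator (5) for two unsynchronized series.

    t, s   : increasing lists of observation times of the X and Y series
    sx, sy : observed values at those times
    """
    total = 0.0
    for j in range(1, len(s)):
        i_plus = bisect_left(t, s[j])             # min{i : t_i >= s_j}
        i_minus = bisect_right(t, s[j - 1]) - 1   # max{i : t_i <= s_{j-1}}
        if i_plus >= len(t) or i_minus < 0:
            continue  # assumption: skip Y ticks not covered by X ticks
        total += (sx[i_plus] - sx[i_minus]) * (sy[j] - sy[j - 1])
    return total


# Synchronized times: the estimator reduces to the sum of products of increments.
sync = unsync_covariance([0, 1, 2, 3], [0, 1, 3, 2], [0, 1, 2, 3], [0, 2, 1, 4])

# Unsynchronized times: the X change is taken over the interval covering [s_0, s_1].
unsync = unsync_covariance([0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0.5, 2.5], [0, 10])
print(sync, unsync)
```

Looping over the shorter series keeps the number of binary searches small, which is the computational point made in the text.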

Theorem 1  The estimator (5) is unbiased, i.e.,

\[ E\,\hat{c}(a, b) = c(a, b). \]

The proof is straightforward and can be found in the Appendix.


No distributional assumptions are made about the noises. They can be exotic
or autocorrelated. The time deformation functions $\tau_X(t)$ and $\tau_Y(s)$
do not need to be known. Therefore, the volatilities in each tick do not
have to be constant or given. Without the presence of noises, the estimator
is f-consistent. However, when there are nonnegligible noises, the variance of
the estimator diverges.

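Theorem 2  Let $Z_X(t_{i^+(j)}) = W_X(t_{i^+(j)}) - W_X(t_{i^-(j-1)})$ and
$Z_Y(s_j) = W_Y(s_j) - W_Y(s_{j-1})$. If all $\epsilon_X(t_i)$ and
$\epsilon_Y(s_j)$ are zero and $\max_j \mathrm{var}\,Z_X(t_{i^+(j)}) \to 0$,
then

\[ \mathrm{var}(\hat{c}(a, b)) = \sum_{a < s_j \le b} \mathrm{var}\bigl(Z_X(t_{i^+(j)})\,Z_Y(s_j)\bigr) \to 0. \tag{7} \]

Otherwise,

\[ \mathrm{var}(\hat{c}(a, b)) \ge \sum_{j} \mathrm{var}\bigl(\epsilon_X(t_{i^+(j)})\bigr)\,\mathrm{var}\bigl(\epsilon_Y(s_j)\bigr). \tag{8} \]

The proof is given in the Appendix.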

If the majority of the noises are nonzero, the right-hand side of equation (8)
approaches infinity as the observation frequency increases. On the other hand,
if the frequency is too low, the right-hand side of equation (7) stays high.
There is an optimal observation frequency that minimizes the variance of the
estimator (5). Finding such an optimal observation frequency without assuming
any regularity of the time sequences $\{t_i\}$ and $\{s_j\}$ is complicated.
To get some idea of this optimal observation frequency, I discuss only the
following simple example:

1.  The time series $S_X(t_i)$ has size $m + 1$ and $S_Y(s_j)$ has size
$n + 1$, with $m = kn$ and

\[ s_{j-1} < t_{(j-1)k} < t_{(j-1)k+1} < \dots < t_{jk-1} < s_j, \qquad j = 1, \dots, n, \]

except

\[ t_0 = a, \quad t_m = b, \quad s_0 = a, \quad s_n = b. \]

It is easy to see that $i^+(j) = jk$ and $i^-(j-1) = (j-1)k - 1$.

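2.  Let $Z_X(t_i) = W_X(t_i) - W_X(t_{i-1})$ and
$Z_Y(s_j) = W_Y(s_j) - W_Y(s_{j-1})$. The variances of the signal changes are
all constant:

\[ \mathrm{var}(Z_X(t_i)) = \frac{\sigma_X^2}{m} \quad\text{and}\quad \mathrm{var}(Z_Y(s_j)) = \frac{\sigma_Y^2}{n}, \]

where $\sigma_X^2 = \tau_X(b) - \tau_X(a)$ and $\sigma_Y^2 = \tau_Y(b) - \tau_Y(a)$.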

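3.  The noises have no autocorrelations and

\[ \mathrm{var}(\epsilon_X(t_i)) = \eta_X^2 \quad\text{and}\quad \mathrm{var}(\epsilon_Y(s_j)) = \eta_Y^2. \]

4.  The variance of the product of the signal changes is

\[ \mathrm{var}\bigl(Z_X(t_i)\,Z_Y(s_j)\bigr) = \alpha\,\sigma_X^2(t_i)\,\sigma_Y^2(s_j) = \alpha\,\frac{\sigma_X^2\,\sigma_Y^2}{mn}, \]

where $\sigma_X^2(t_i) = \mathrm{var}(Z_X(t_i))$,
$\sigma_Y^2(s_j) = \mathrm{var}(Z_Y(s_j))$, and $\alpha$ is a constant
between 1 and 2.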


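Under these conditions, the variance of the estimator (5) is:

\[ \mathrm{var}(\hat{c}(a, b)) = \alpha\Bigl(1 + \frac{1}{k}\Bigr)\frac{\sigma_X^2\sigma_Y^2}{n} + 2\sigma_X^2\eta_Y^2 + 2\sigma_Y^2\eta_X^2 + \frac{2\sigma_X^2\eta_Y^2}{m} + 4n\,\eta_Y^2\eta_X^2, \tag{9} \]

where $\alpha$ is a constant between 1 and 2. The proof of the equation is
given in the Appendix.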
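For fixed $k$, and reading (9) with $m = kn$, the variance has the form $A/n + 4n\,\eta_X^2\eta_Y^2$ plus terms that do not depend on $n$, so it is minimized at $n = \sqrt{A/(4\eta_X^2\eta_Y^2)}$. The sketch below evaluates this numerically. It is an illustration only: the function names, the default $\alpha = 1.5$, and treating $n$ as a continuous variable are assumptions made here, and the formula coded is this reading of (9).

```python
import math


def estimator_variance(n, k, sigma2_x, sigma2_y, eta2_x, eta2_y, alpha=1.5):
    """Right-hand side of (9), read with m = k * n (alpha is an assumed value)."""
    m = k * n
    return (alpha * (1.0 + 1.0 / k) * sigma2_x * sigma2_y / n
            + 2.0 * sigma2_x * eta2_y
            + 2.0 * sigma2_y * eta2_x
            + 2.0 * sigma2_x * eta2_y / m
            + 4.0 * n * eta2_x * eta2_y)


def optimal_n(k, r_x, r_y, alpha=1.5):
    """Minimizer of (9) over n for fixed k, with r = sigma^2 / eta^2."""
    return math.sqrt((alpha * (1.0 + 1.0 / k) * r_x * r_y + 2.0 * r_x / k) / 4.0)


# Hypothetical inputs: unit signal variances, noise variances of 0.01, k = 2.
n_star = optimal_n(2, r_x=100.0, r_y=100.0)
v_star = estimator_variance(n_star, 2, 1.0, 1.0, 0.01, 0.01)
v_low = estimator_variance(0.9 * n_star, 2, 1.0, 1.0, 0.01, 0.01)
v_high = estimator_variance(1.1 * n_star, 2, 1.0, 1.0, 0.01, 0.01)
print(n_star, v_star, v_low, v_high)
```

Moving the observation frequency away from `n_star` in either direction increases the variance, which is the trade-off between bounds (7) and (8).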

From (9), the optimal observation frequency $n$ is near
$\sqrt{(\alpha(1 + 1/k)\,r_X r_Y + 2 r_X/k)/4}$, where
$r_X = \sigma_X^2/\eta_X^2$ and $r_Y = \sigma_Y^2/\eta_Y^2$ are the
signal-to-noise ratios. When the size of one series is much bigger than the
size of the other, the optimal observation frequency is near
$\sqrt{\alpha\,r_X r_Y}/2$. The optimal observation frequency is therefore
proportional to the geometric mean of the two signal-to-noise ratios.
When the noise level is high, high frequency data, such as tick-by-tick data,
may have too many data points for the estimator (5). Of course we can
throw out some data points. However, to utilize all the data points, I propose
the following estimator. Again assume that $\{S_X(t_i)\}$ has $m$ data points
and $\{S_Y(s_j)\}$ has $n$ data points, $n < m$, and that $n$ is $k$ times
larger than the optimal observation frequency. I first use $S_Y(s_j)$,
$j = 0, k, 2k, \dots$, to estimate the covariance, then use $S_Y(s_j)$,
$j = 1, k+1, 2k+1, \dots$, to estimate the covariance, and so on.
Finally, I average these $k$ estimates. In summary,

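\[ \hat{c}_k(a, b) = \frac{1}{k} \sum_{p=1}^{k} \; \sum_{j \equiv p \,(\mathrm{mod}\, k)} \bigl(S_X(t_{i^+(j)}) - S_X(t_{i^-(j-k)})\bigr)\bigl(S_Y(s_j) - S_Y(s_{j-k})\bigr). \tag{10} \]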

This estimator is still not f-consistent. However, the upper bound of the
variance of the estimator (10) becomes finite when the observation frequency
approaches infinity.
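The averaging in (10) can be sketched as follows. Since every index $j \ge k$ belongs to exactly one residue class modulo $k$, the double sum over the $k$ offsets is written here as a single loop over $j$. This is a minimal illustration under assumptions made here (the function name, list inputs, and skipping of Y ticks not covered by X ticks), not the author's code.

```python
from bisect import bisect_left, bisect_right


def subsampled_covariance(t, sx, s, sy, k):
    """Sketch of estimator (10): average of k estimates built from every k-th Y tick.

    t, s   : increasing lists of observation times of the X and Y series
    sx, sy : observed values at those times
    k      : subsampling factor for the Y series
    """
    total = 0.0
    for j in range(k, len(s)):
        i_plus = bisect_left(t, s[j])             # min{i : t_i >= s_j}
        i_minus = bisect_right(t, s[j - k]) - 1   # max{i : t_i <= s_{j-k}}
        if i_plus >= len(t) or i_minus < 0:
            continue  # assumption: skip Y ticks not covered by X ticks
        total += (sx[i_plus] - sx[i_minus]) * (sy[j] - sy[j - k])
    return total / k


times = [0, 1, 2, 3]
x_obs = [0, 1, 3, 2]
y_obs = [0, 2, 1, 4]
est_k1 = subsampled_covariance(times, x_obs, times, y_obs, 1)
est_k2 = subsampled_covariance(times, x_obs, times, y_obs, 2)
print(est_k1, est_k2)
```

With `k = 1` the function reduces to estimator (5); larger `k` lengthens each increment, so the signal part of every product grows while the noise part does not.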

3     Estimating Covariance and Correlation Matrices of Three Exchange Rates

In this section, I will estimate the daily covariance and correlation matrices
of exchange rates from tick-by-tick data. The high frequency data is provided
by Olsen & Associates. It is referred to as the HFDF93 data set. It contains
spot rate quotes of the Deutsche mark and US dollar (DEM/USD), the
Japanese yen and US dollar (JPY/USD) and the Deutsche mark and Japanese
yen (JPY/DEM) from October 1, 1992 to September 30, 1993. The data is
recorded from the Reuters screen. DEM/USD is the most actively traded
currency pair in the market, followed by JPY/USD. Cross-rates, like JPY/DEM,
are much less active. In this paper only the bid prices are used because the
bid price is quoted in its entirety. The ask price is quoted by the last two
or three digits only, and interpreting the ask price by computer is often
troublesome. Since the data consists of non-binding quotes, the level of
observation noise is very high. The basic statistics of the returns of the
three exchange rates are listed in Table 1.

Table 1: Summary Statistics of Tick-by-tick Returns

Series        n         Min.      Max.      Mean       sd         Skew.    Kurt.    ρ
DEM/USD       1472032   -.0066    .00544     9.92e-8   2.67e-4     .017     6.87    -.451
JPY/USD        570689   -.0105    0.0104    -2.18e-7   3.72e-4    -.029    32.17    -.425
JPY/DEM        158958   -.0098    0.0100    -1.70e-6   3.50e-4    -.385    23.26    -.107

ρ: the first order autocorrelation.

To examine whether there are any lagged correlations among the three exchange
rates, cross-correlations of daily returns are calculated. Figure 1 shows that
the three exchange rates have significant correlations, but there are no
lagged correlations.

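To calculate the correlation matrix, the following volatility estimator is
used (Zhou 1995a):

\[ \hat{\sigma}^2 = \frac{1}{k} \sum_{p=1}^{k} \; \sum_{i \equiv p \,(\mathrm{mod}\, k)} \bigl(S(t_i) - S(t_{i-k})\bigr)\bigl(S(t_{i+k}) - S(t_{i-2k})\bigr), \tag{11} \]

and the level of noise is estimated by the sample autocovariance

\[ \hat{\eta}^2 = -\frac{1}{n} \sum_{p=1}^{k} \; \sum_{i \equiv p \,(\mathrm{mod}\, k)} \bigl(S(t_i) - S(t_{i-k})\bigr)\bigl(S(t_{i-k}) - S(t_{i-2k})\bigr). \tag{12} \]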

The estimates of the annual volatilities, the variances of the signals, and the
noise level $\eta^2$ are listed in Table 2.
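Estimators of the form (11) and (12) can be sketched as below. This is an illustration under assumptions made here, not the exact procedure of the paper: the function names are hypothetical, the sums run over every index for which all required observations exist, and the noise estimate is normalized by the number of products in the sum.

```python
def volatility_estimate(obs, k):
    """Sketch of (11): sum of (S_i - S_{i-k}) * (S_{i+k} - S_{i-2k}), divided by k."""
    total = 0.0
    for i in range(2 * k, len(obs) - k):
        total += (obs[i] - obs[i - k]) * (obs[i + k] - obs[i - 2 * k])
    return total / k


def noise_estimate(obs, k):
    """Sketch of (12): minus the average product of consecutive k-tick changes.

    Assumption: the normalizing constant is the number of products in the sum.
    """
    products = [(obs[i] - obs[i - k]) * (obs[i - k] - obs[i - 2 * k])
                for i in range(2 * k, len(obs))]
    return -sum(products) / len(products)


series = [0, 1, 3, 2, 4, 7]      # made-up observations for illustration
vol = volatility_estimate(series, 1)
noise = noise_estimate(series, 1)
print(vol, noise)
```

Because $S(t_{i+k}) - S(t_{i-2k})$ is the sum of the current $k$-tick change and its two neighbours, (11) equals the sum of squared changes plus both first-order cross products, which is how the negative autocovariance induced by the noise is cancelled. On a short made-up series the noise estimate can be negative, as in this example.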

Figure 1: Cross Correlation of Three Exchange Rates Using Daily Returns.


Table 2: Estimation of Signal and Noise Levels

Series        n         k    Ann. vol.    η^2        S-N r
DEM/USD       1472032   5    .0148        2.86e-8     517482
JPY/USD        570689   3    .0153        5.35e-8     285981
JPY/DEM        158958   1    .0153        1.30e-8    1176923

From Table 2, I can get a rough estimate of the optimal observation
frequency for each pair of exchange rates. For DEM/USD and JPY/USD, the
rough estimate is about 260,000. Compared to the size of the smaller series,
JPY/USD, it is less than one half. I choose k = 3 in formula (10) to estimate
the covariance. Using a similar argument, k = 1 is chosen for the two other
pairs of exchange rates. The daily covariances of the three pairs of exchange
rates are given in Figure 2.

Since the market volatility changes over time, the change of the covariance
is affected by changing volatility; therefore a correlation matrix may be more
desirable in some cases. Using volatility estimator (11), daily volatilities are
calculated for each exchange rate. Daily correlations of the three exchange
rates are plotted in Figure 3.

There are several interesting observations from Figure 3. First, the
correlation between DEM/USD and JPY/USD is almost always positive. This
indicates that the US dollar is a leading currency. When it moves, it moves in
the same direction against both the Deutsche mark and the Japanese yen. The
Deutsche mark has the same feature. During the first four and a half months
of the time interval, both the US dollar and the Deutsche mark dominated the
market. However, after February of 1993, the Japanese yen became a
dominating currency. The role of each currency changes over a long time
interval. However, the roles are relatively stable over short periods such as
months.

Figure 2: Daily Estimates of the Covariance of Three Exchange Rates.

Figure 3: Daily Estimates of the Correlation of Three Exchange Rates.

Since the three exchange rates are dependent, the actual covariance matrix
is singular. Buying one unit of DEM/USD and JPY/DEM and selling one unit
of JPY/USD yields a neutral position. The empirical results confirm that the
minimum eigenvalues of the estimated monthly covariance matrices are very
close to zero for all twelve months.
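The singularity can be checked directly on a small example. The sketch below uses made-up numbers and assumes that returns are log returns, so that the JPY/DEM return equals the JPY/USD return minus the DEM/USD return; the helper name `covariance_matrix` is hypothetical. The neutral position then corresponds to the weight vector (1, -1, 1), whose variance under the covariance matrix is zero.

```python
def covariance_matrix(columns):
    """Sample covariance matrix of equal-length return series (list of lists)."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    return [[sum((a[i] - ma) * (b[i] - mb) for i in range(n)) / (n - 1)
             for b, mb in zip(columns, means)]
            for a, ma in zip(columns, means)]


dem_usd = [0.002, -0.001, 0.003, 0.000, -0.002]       # made-up log returns
jpy_usd = [0.001, 0.002, -0.001, 0.003, -0.004]       # made-up log returns
jpy_dem = [y - d for y, d in zip(jpy_usd, dem_usd)]   # assumed triangular identity

cov = covariance_matrix([dem_usd, jpy_usd, jpy_dem])
weights = [1.0, -1.0, 1.0]  # long DEM/USD, short JPY/USD, long JPY/DEM
portfolio_variance = sum(weights[i] * cov[i][j] * weights[j]
                         for i in range(3) for j in range(3))
print(portfolio_variance)
```

A zero quadratic form for a nonzero weight vector means the covariance matrix has a zero eigenvalue, up to rounding error. In estimated matrices the smallest eigenvalue is only approximately zero, because each entry is estimated separately from noisy quotes.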

4     Discussion 

High frequency data provides us with enough data not only to estimate a
large covariance matrix, but also to estimate the covariance matrix over a
short time interval. This is extremely beneficial in a fast changing market.
It enables people to see market changes quickly and adjust their portfolios in
time. Of course, using high frequency data is not without its difficulties. The
nonsynchronized time causes great difficulty in obtaining a maximum likelihood
estimator, and the observation noise causes great difficulty in achieving
f-consistency. The covariance estimator proposed here is simple and makes few
assumptions. However, if one is willing to impose more conditions, the
estimator can be improved.

In the currency market, it is reasonable to assume no leading correlation
(4) among exchange rates. This may not be true in the stock market when
small stocks are considered. Lo (1990) showed that there is asymmetry in the
stock market, i.e., big stocks may lead small stocks. The proposed estimator
does not work in such a case. However, small stocks are often thinly traded;
they are not the focus of this research.
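
Appendix:

i)  Proof of Theorem 1, the unbiasedness of estimator (5):
First, I write the price changes as follows:

\[ S_X(t_{i^+(j)}) - S_X(t_{i^-(j-1)}) = Z_X(t_{i^+(j)}) + \epsilon_X(t_{i^+(j)}) - \epsilon_X(t_{i^-(j-1)}), \quad\text{and} \]
\[ S_Y(s_j) - S_Y(s_{j-1}) = Z_Y(s_j) + \epsilon_Y(s_j) - \epsilon_Y(s_{j-1}). \]

Then

\[
\begin{aligned}
E\,\hat{c}(a, b) &= E \sum_{a < s_j \le b} \bigl[\, Z_X(t_{i^+(j)}) Z_Y(s_j) + Z_X(t_{i^+(j)}) \epsilon_Y(s_j) - Z_X(t_{i^+(j)}) \epsilon_Y(s_{j-1}) \\
&\qquad + \epsilon_X(t_{i^+(j)}) Z_Y(s_j) + \epsilon_X(t_{i^+(j)}) \epsilon_Y(s_j) - \epsilon_X(t_{i^+(j)}) \epsilon_Y(s_{j-1}) \\
&\qquad - \epsilon_X(t_{i^-(j-1)}) Z_Y(s_j) - \epsilon_X(t_{i^-(j-1)}) \epsilon_Y(s_j) + \epsilon_X(t_{i^-(j-1)}) \epsilon_Y(s_{j-1}) \,\bigr] \\
&= \sum_{a < s_j \le b} E\, Z_X(t_{i^+(j)}) Z_Y(s_j) \\
&= \sum_{a < s_j \le b} c(s_{j-1}, s_j) \\
&= c(a, b).
\end{aligned}
\]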

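ii)  Proof of Theorem 2, the f-consistency of estimator (5):

Use the same notation as in Theorem 1 and let
$\sigma_X^2(t_{i^+(j)}) = \mathrm{var}(Z_X(t_{i^+(j)}))$ and
$\sigma_Y^2(s_j) = \mathrm{var}(Z_Y(s_j))$. When there are no noises,

\[
\begin{aligned}
\mathrm{var}(\hat{c}(a, b)) &= \mathrm{var}\Bigl( \sum_{a < s_j \le b} Z_X(t_{i^+(j)}) Z_Y(s_j) \Bigr) \\
&\le \sum_{a < s_j \le b} \sqrt{E Z_X^4(t_{i^+(j)})\; E Z_Y^4(s_j)} \\
&= \sum_{a < s_j \le b} \sqrt{3\sigma_X^4(t_{i^+(j)})\; 3\sigma_Y^4(s_j)} \\
&\le 3 \max_j \sigma_X^2(t_{i^+(j)}) \sum_j \sigma_Y^2(s_j) \to 0.
\end{aligned}
\]

On the other hand, when there are noises,

\[ \mathrm{var}(\hat{c}(a, b)) \ge \mathrm{var}\Bigl( \sum_{a < s_j \le b} \epsilon_X(t_{i^+(j)}) \epsilon_Y(s_j) \Bigr). \]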

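iii)  Proof of equation (9):

\[
\begin{aligned}
\mathrm{var}(\hat{c}(a, b)) &= \mathrm{var}\Bigl( \sum_{j=1}^{n} \bigl[\, (Z_X(t_{(j-1)k}) + \dots + Z_X(t_{jk}))\, Z_Y(s_j) \\
&\qquad + (Z_X(t_{(j-1)k}) + \dots + Z_X(t_{jk}))\, \epsilon_Y(s_j) \\
&\qquad - (Z_X(t_{(j-1)k}) + \dots + Z_X(t_{jk}))\, \epsilon_Y(s_{j-1}) \\
&\qquad + \epsilon_X(t_{jk}) Z_Y(s_j) + \epsilon_X(t_{jk}) \epsilon_Y(s_j) - \epsilon_X(t_{jk}) \epsilon_Y(s_{j-1}) \\
&\qquad - \epsilon_X(t_{(j-1)k-1}) Z_Y(s_j) - \epsilon_X(t_{(j-1)k-1}) \epsilon_Y(s_j) + \epsilon_X(t_{(j-1)k-1}) \epsilon_Y(s_{j-1}) \,\bigr] \Bigr) \\
&= \alpha\Bigl(1 + \frac{1}{k}\Bigr)\frac{\sigma_X^2\sigma_Y^2}{n} + 2\sigma_X^2\eta_Y^2 + \frac{2\sigma_X^2\eta_Y^2}{m} + 2\sigma_Y^2\eta_X^2 + 4n\,\eta_Y^2\eta_X^2.
\end{aligned}
\]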